Improving Mapreduce for Incremental Processing Using Map Data Storage
نویسندگان
چکیده
منابع مشابه
Large-scale incremental processing with MapReduce
An important property of today’s big data processing is that the same computation is often repeated on datasets evolving over time, such as web and social network data. While repeating full computation of the entire datasets is feasible with distributed computing frameworks such as Hadoop, it is obviously inefficient and wastes resources. In this paper, we present HadUP (Hadoop with Update Proc...
متن کاملImproving Node-Level MapReduce Performance Using Processing-in-Memory Technologies
Processing-in-Memory (PIM) is the concept of moving computation as close as possible to memory. This decreases the need for the movement of data between central processor and memory system, hence improves energy efficiency from the reduced memory traffic. In this paper we present our approach on how to embed processing cores in 3D-stacked memories, and evaluate the use of such a system for Big ...
متن کاملMatching-Based Allocation Strategies for Improving Data Locality of Map Tasks in MapReduce
MapReduce is a well-know framework for distributing data-processing computations on parallel clusters. In MapReduce, a large computation is broken into small tasks that run in parallel on multiple machines, and scales easily to very large clusters of inexpensive commodity computers. Before the Map phase, the original dataset is first split into chunks, that are replicated (a constant number of ...
متن کاملiMapReduce: Incremental MapReduce for Mining Evolving Big Data
As new data and updates are constantly arriving, the results of data mining applications become stale and obsolete over time. Incremental processing is a promising approach to refreshing mining results. It utilizes previously saved states to avoid the expense of re-computation from scratch. In this paper, we propose i2MapReduce, a novel incremental processing extension to MapReduce, the most wi...
متن کاملParallel Data Processing in Dynamic Hybrid Computing Environment Using MapReduce
A novel MapReduce computation model in hybrid computing environment called HybridMR is proposed in the paper. Using this model, high performance cluster nodes and heterogeneous desktop PCs in Internet or Intranet can be integrated to form a hybrid computing environment. In this way, the computation and storage capability of largescale desktop PCs can be fully utilized to process large-scale dat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Procedia Computer Science
سال: 2016
ISSN: 1877-0509
DOI: 10.1016/j.procs.2016.05.163